Supplementary Materials Overview

This supplement provides detailed methods, additional results, and reproducibility information for the main manuscript. All items are cross-referenced to specific sections of the main text.

Contents:

  • Section 1: Supplementary Methods — Extended protocols and statistical details
  • Section 2: Supplementary Tables (S1–S13) — Full statistical results referenced in main text
    • Table S8: Comprehensive richness-condition diagnostic analysis (sampling artifact tests, key species analysis, taxonomic group effects)
  • Section 3: Supplementary Figures (S1–S17) — Supporting visualizations for main text findings
  • Section 4: Reproducibility Guide — Scripts and data for reproducing all analyses

1 Supplementary Methods

This section provides extended methodological details for readers seeking to understand or replicate our analyses. Each subsection corresponds to a specific aspect of the main text Methods section.

1.1 S1.1 Study Site Details

Main text reference: Methods → Study System (page 3)

Surveys were conducted at three reef sites representing distinct reef environments around Mo’orea, French Polynesia (17°30’S, 149°50’W) during June–August 2019. Sites were selected to capture environmental heterogeneity that might influence cryptofauna community assembly.

| Site | Code | Reef Type | Coordinates | Depth | Environment | Key Features |
|---|---|---|---|---|---|---|
| Hauru | HAU | Fringing | -17.4833°, -149.8667° | 1.2–4.8 m | Moderate exposure | High water circulation, diverse coral cover |
| Maatea | MAT | Back-reef lagoon | -17.5333°, -149.8000° | 2.1–5.9 m | Sheltered | Calm conditions, high sedimentation |
| Barrier Reef | MRB | Barrier reef crest | -17.4667°, -149.8333° | 3.2–7.6 m | High energy | Oceanic swells, high coral density |

Why these sites matter: The three sites represent a gradient from sheltered lagoon (MAT) to exposed barrier reef (MRB), allowing us to test whether cryptofauna assembly rules generalize across environmental contexts or depend on local conditions. This design directly supports our third research question about reef environment effects on community composition.

Reproducibility: Site coordinates and characteristics are in data/survey_coral_characteristics_merged_v2.csv. Map generated by scripts/03_spatial_patterns.R.

1.2 S1.2 Coral Volume Estimation

Main text reference: Methods → Coral and Fauna Sampling (page 3); supports all scaling analyses

Accurate volume estimation is critical because our central hypothesis (propagule dilution) predicts that fauna abundance scales sublinearly with habitat volume.

Volume Calculation Protocol

Colony volume was calculated using the hemi-ellipsoid approximation:

\[V = \frac{2}{3} \pi \times r_1 \times r_2 \times h\]

Where:

  • \(r_1\) = semi-major axis (half of maximum diameter)
  • \(r_2\) = semi-minor axis (half of perpendicular diameter)
  • \(h\) = colony height
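
As an illustration of the formula above, the following Python sketch computes the hemi-ellipsoid volume from field measurements. It is not the authors' code (their calculation lives in the R script scripts/01_load_clean_data.R); the function name, argument names, and the example colony dimensions are hypothetical.

```python
import math

def hemi_ellipsoid_volume(max_diameter_cm, perp_diameter_cm, height_cm):
    """Volume (cm^3) of a hemi-ellipsoid: V = (2/3) * pi * r1 * r2 * h."""
    r1 = max_diameter_cm / 2.0   # semi-major axis (half of maximum diameter)
    r2 = perp_diameter_cm / 2.0  # semi-minor axis (half of perpendicular diameter)
    return (2.0 / 3.0) * math.pi * r1 * r2 * height_cm

# Hypothetical colony: 20 cm x 10 cm footprint, 10 cm tall
example_volume = hemi_ellipsoid_volume(20.0, 10.0, 10.0)
print(round(example_volume, 1))  # 1047.2
```

Note that the function takes full diameters and halves them internally, which mirrors how the axes are defined above.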

Validation: This method was validated against water displacement measurements for 25 Pocillopora colonies (R² = 0.94, slope = 0.98), confirming that the ellipsoid approximation closely matches true volume.

Range of colony sizes: Our sampling captured colonies spanning three orders of magnitude in volume (12–18,400 cm³), providing statistical power to detect scaling relationships.

Reproducibility: Volume calculations in scripts/01_load_clean_data.R (lines 45–60). Raw measurements in data/survey_coral_characteristics_merged_v2.csv.

1.3 S1.3 Fauna Extraction and Identification

Main text reference: Methods → Coral and Fauna Sampling (page 3)

Complete extraction of cryptofauna is essential for accurate community characterization. Our protocol was designed to maximize recovery of mobile and sessile organisms from the complex three-dimensional structure of branching corals.

Extraction Protocol:

  1. Field enclosure: Each coral enclosed in fine-mesh bag (1 mm mesh) while still attached
  2. Removal: Colony carefully detached using hammer and chisel
  3. Agitation sequence:
    • 5 minutes vigorous agitation in 20 L seawater
    • 10 minutes rest (allows mobile fauna to emerge)
    • 5 minutes additional agitation
  4. Manual inspection: Branch interstices examined with forceps
  5. Preservation: All fauna preserved in 95% ethanol

Identification: Specimens sorted to operational taxonomic units (OTUs) based on morphological characters under dissecting microscope (10–40× magnification). The 243 OTUs include 87 identified to species level, with the remainder identified to genus or family.

Limitations: Molecular barcoding was not performed; cryptic species may be conflated within morphological OTUs. However, community-level patterns should be robust to moderate identification uncertainty.

Reproducibility: Fauna data in data/survey_cafi_data_w_taxonomy_summer2019_v5.csv. Taxonomic summaries generated by scripts/02_community_composition.R.

1.4 S1.4 Position Correction for Physiological Measurements

Main text reference: Results → Result 4: Coral Condition Independent of Size (page 6); Methods → Statistical Analyses (implicit)

This methodological innovation addresses a systematic bias in coral physiology sampling that could confound condition–diversity relationships.

The Problem

Physiological measurements were taken from branch tips at varying positions on the colony. Sampling position (measured as “stump length”—distance from colony base to sampling point) correlates strongly with colony size:

  • Correlation: r = 0.565, p < 0.001
  • Implication: Larger colonies were sampled at more distal positions, potentially biasing physiological comparisons

The Solution: Residual Analysis

For each physiological trait (protein, carbohydrate, zooxanthellae density, AFDW):

  1. Fit linear model: trait ~ stump_length
  2. Extract residuals as position-corrected values
  3. Standardize to z-scores for comparability

Condition Score = PC1 of position-corrected traits (~60% variance explained)

Validation: Corrected condition score shows |r| < 0.10 with colony volume (vs. r = 0.42 for uncorrected score), confirming successful removal of size confound.
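
The residual-based correction can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation (which is in scripts/05a_coral_characteristics.R): the function names and the toy measurements are made up, the regression is ordinary least squares with an intercept, and the final PCA step that produces the condition score is omitted.

```python
from statistics import mean, stdev

def ols_residuals(y, x):
    """Residuals of the simple linear regression y ~ x (with intercept)."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

def z_scores(values):
    """Standardize to mean 0 and (sample) standard deviation 1."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

def position_correct(trait, stump_length):
    """Steps 1-3: fit trait ~ stump_length, take residuals, standardize."""
    return z_scores(ols_residuals(trait, stump_length))

# Toy values (hypothetical, for illustration only)
stump_length = [1.0, 2.0, 3.0, 4.0, 5.0]
protein = [2.1, 2.9, 4.2, 4.8, 6.0]
corrected = position_correct(protein, stump_length)
print([round(v, 2) for v in corrected])
```

By construction, OLS residuals are uncorrelated with the predictor, so the corrected values carry no linear signal of sampling position. Whether they are also uncorrelated with colony volume is an empirical question, which is what the validation above checks.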

Why this matters: Without position correction, any relationship between fauna diversity and coral “condition” could simply reflect the shared correlation of both variables with colony size. Our correction isolates true condition variation from size effects.

Reproducibility: Position correction implemented in scripts/05a_coral_characteristics.R (lines 120–180). Validation plots in Figure S3.

1.5 S1.5 Statistical Models

Main text reference: Methods → Statistical Analyses (page 4)

This section provides complete model specifications for readers wishing to replicate our analyses or understand the statistical framework.

1.5.1 Power-Law Scaling Model

Purpose: Test whether fauna abundance scales sublinearly with coral volume (supporting propagule dilution hypothesis).

Abundance ~ log(Volume) + Branch_Width + (1 + log(Volume) | Site)
  • Family: Negative binomial with log link (accounts for overdispersion common in count data). The response is the raw count; the log link places expected abundance on the log scale, so the count itself is not log-transformed.
  • Fixed effects:
    • log(Volume): scaling exponent β; sublinear if β < 1
    • Branch_Width: binary covariate for coral architecture
  • Random effects: Site-specific intercepts and slopes allow heterogeneous scaling across reef environments
  • Interpretation: If β = 0.49, a 10-fold increase in volume yields only a 10^0.49 ≈ 3.1-fold increase in abundance
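
To make the interpretation of the exponent concrete, the short Python sketch below converts a scaling exponent into fold changes in abundance and in density (abundance per unit volume). It assumes a pure power law, abundance ∝ volume^β, and uses the exponent reported in Table S3; the function names are illustrative and not part of the authors' scripts.

```python
def abundance_fold_change(beta, volume_factor):
    """Fold change in expected abundance when volume is multiplied by
    volume_factor, assuming abundance is proportional to volume**beta."""
    return volume_factor ** beta

def density_fold_change(beta, volume_factor):
    """Fold change in abundance per unit volume under the same power law."""
    return volume_factor ** (beta - 1.0)

beta = 0.46  # scaling exponent reported in Table S3
print(round(abundance_fold_change(beta, 10), 2))  # 2.88
print(round(density_fold_change(beta, 10), 2))    # 0.29
```

Under isometric scaling (β = 1) the density fold change is exactly 1, so any value below 1 expresses the dilution directly.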

1.5.2 Community Composition (PERMANOVA)

Purpose: Partition variation in community composition among site, volume, and coral characteristics.

  • Dissimilarity: Bray-Curtis on Hellinger-transformed abundances
  • Transformation: Hellinger reduces influence of rare species while maintaining sensitivity to composition shifts
  • Permutations: 999 permutations for p-value estimation
  • Design: Sequential model testing Site, log(Volume), Branch Width, Depth
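
The dissimilarity step can be illustrated with a small Python sketch: a Hellinger transformation of each coral's abundance vector followed by Bray-Curtis dissimilarity between two corals. This only shows the two formulas; it is not the authors' R code and does not perform the permutation test. The function names and the toy abundance vectors are hypothetical.

```python
import math

def hellinger(counts):
    """Hellinger transform: square root of each species' relative abundance."""
    total = sum(counts)
    if total == 0:
        return [0.0 for _ in counts]
    return [math.sqrt(c / total) for c in counts]

def bray_curtis(a, b):
    """Bray-Curtis dissimilarity: sum|a_i - b_i| / sum(a_i + b_i)."""
    numerator = sum(abs(x - y) for x, y in zip(a, b))
    denominator = sum(x + y for x, y in zip(a, b))
    return numerator / denominator if denominator else 0.0

# Toy abundance vectors for two corals (same species order in both)
coral_a = [4, 0, 1, 5]
coral_b = [0, 3, 1, 6]
dissimilarity = bray_curtis(hellinger(coral_a), hellinger(coral_b))
print(round(dissimilarity, 3))
```

The square root compresses large counts, which is why the transformation reduces the weight of numerically dominant species before dissimilarities are computed.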

1.5.3 Network Analysis

Purpose: Identify non-random species associations and keystone taxa.

  • Network construction: Spearman correlations on presence-absence data
  • Edge threshold: |ρ| > 0.3 and p < 0.05 (Bonferroni-corrected)
  • Modularity: Louvain algorithm; compared to 1,000 degree-preserving null networks
  • Centrality metrics: Degree (number of connections), betweenness (bridging importance), eigenvector (influence)
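
The edge-filtering rule can be sketched as follows in Python. The sketch assumes the pairwise correlations and raw p-values have already been computed, and it assumes the Bonferroni correction multiplies each raw p-value by the number of pairwise tests; the source does not state the exact family of tests used. All names and the toy values are hypothetical.

```python
from collections import Counter

def significant_edges(pairwise, rho_min=0.3, alpha=0.05):
    """Keep pairs with |rho| > rho_min and Bonferroni-adjusted p < alpha.

    pairwise: list of (species_a, species_b, rho, p_raw) tuples.
    """
    n_tests = len(pairwise)
    edges = []
    for a, b, rho, p_raw in pairwise:
        p_adj = min(1.0, p_raw * n_tests)  # Bonferroni adjustment (assumed)
        if abs(rho) > rho_min and p_adj < alpha:
            edges.append((a, b, rho))
    return edges

def degree(edges):
    """Degree centrality: number of retained edges touching each species."""
    counts = Counter()
    for a, b, _ in edges:
        counts[a] += 1
        counts[b] += 1
    return dict(counts)

# Toy correlations (hypothetical species labels and values)
toy = [
    ("sp1", "sp2", 0.55, 0.001),
    ("sp1", "sp3", 0.35, 0.030),   # fails after correction: 0.030 * 4 = 0.12
    ("sp2", "sp3", 0.10, 0.400),   # fails the |rho| threshold
    ("sp1", "sp4", -0.48, 0.004),
]
edges = significant_edges(toy)
print(degree(edges))  # {'sp1': 2, 'sp2': 1, 'sp4': 1}
```

Because the threshold is on |ρ|, strong negative associations are retained as edges too; a positive-only network would need an extra sign check.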

Reproducibility: All models in scripts/05_coral_cafi_relationships.R (scaling), scripts/04_diversity_analysis.R (PERMANOVA), scripts/06_network_analysis.R (networks).


2 Supplementary Tables

All tables provide detailed statistical results that support findings described in the main text. Each table indicates the specific main text section it supports.

2.1 Table S1: Sampling Summary

Supports: Main text Methods → Study System; provides sample sizes for all analyses

This table summarizes the sampling effort and basic community metrics for each site, providing context for interpreting site-specific results.

Table S1. Sampling summary by site. N = number of colonies; CAFI = coral-associated fauna inventory; Species = operational taxonomic units; Volume = colony volume (mean ± SD).
| Site | N Corals | CAFI Individuals | Species (OTUs) | Mean Volume (cm³) | Depth Range (m) |
|---|---|---|---|---|---|
| HAU | 38 | 4,234 | 62 | 1,245 ± 892 | 1.2–4.8 |
| MAT | 39 | 5,102 | 71 | 1,567 ± 1,102 | 2.1–5.9 |
| MRB | 35 | 3,498 | 58 | 1,089 ± 756 | 3.2–7.6 |
| Total | 112 | 12,834 | 87 unique | 1,312 ± 945 | 1.2–7.6 |

Reproducibility: Generated by scripts/01_load_clean_data.R. Source data: data/survey_coral_characteristics_merged_v2.csv.

2.2 Table S2: PERMANOVA Results

Supports: Main text Results → Result 1: Reef Environment Structures Community Composition (page 5)

This table presents the full PERMANOVA results showing how much community variation is explained by each factor. The dominant site effect (R² = 0.106) supports our finding that reef environment shapes cryptofauna communities more strongly than coral morphology.

Table S2. PERMANOVA results for CAFI community composition. Analysis based on Bray-Curtis dissimilarity of Hellinger-transformed abundances (999 permutations).

| Term | Df | SS | R² | F | p |
|---|---|---|---|---|---|
| Site | 2 | 2.847 | 0.106 | 4.31 | 0.001 |
| log(Volume) | 1 | 1.234 | 0.046 | 3.74 | 0.002 |
| Branch Width | 1 | 0.567 | 0.021 | 1.72 | 0.078 |
| Depth | 1 | 0.423 | 0.016 | 1.28 | 0.198 |
| Site × Volume | 2 | 0.312 | 0.012 | 0.94 | 0.456 |
| Residual | 104 | 21.456 | 0.799 | | |
| Total | 111 | 26.839 | 1.000 | | |

Interpretation for naive readers: PERMANOVA (Permutational Multivariate Analysis of Variance) partitions variation in community composition among predictor variables. R² indicates the proportion of total variation explained by each factor. Here, Site explains 10.6% of variation—more than any coral-level characteristic—indicating that reef environment filters community composition.
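
The R² column follows directly from the sums of squares: each term's R² is its SS divided by the total SS. The Python sketch below reproduces that arithmetic from the SS values in Table S2; the variable and function names are illustrative only.

```python
# Sums of squares taken from Table S2
sums_of_squares = {
    "Site": 2.847,
    "log(Volume)": 1.234,
    "Branch Width": 0.567,
    "Depth": 0.423,
    "Site × Volume": 0.312,
    "Residual": 21.456,
}

def permanova_r2(ss_by_term):
    """Proportion of total variation attributed to each term: SS_term / SS_total."""
    total = sum(ss_by_term.values())
    return {term: ss / total for term, ss in ss_by_term.items()}

r2 = permanova_r2(sums_of_squares)
for term, value in r2.items():
    print(f"{term}: {value:.3f}")
```

In a sequential design these proportions depend on the order in which terms enter the model, so the values describe this ordering (Site first) rather than an order-free partition.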

Reproducibility: Generated by scripts/04_diversity_analysis.R (lines 200–250). Output saved to output/tables/permanova_results.csv.

2.3 Table S3: Power-Law Scaling Model

Supports: Main text Results → Result 2: Sublinear Scaling Reveals Propagule Dilution (page 5)

This table presents the full model output for the scaling analysis, including random effects variance. The scaling exponent β = 0.46 is significantly less than 1.0, supporting the propagule dilution hypothesis.

Table S3. Negative binomial GLMM for CAFI abundance ~ log(Volume). The scaling exponent β = 0.46 (95% CI: 0.38–0.53) is significantly less than 1.0, indicating sublinear scaling consistent with propagule dilution. Random slopes (SD = 0.11) indicate modest site-to-site variation in scaling.
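| Effect | Estimate | SE | 95% CI | z | p |
|---|---|---|---|---|---|
| Fixed Effects | | | | | |
| Intercept | 1.234 | 0.156 | [0.93, 1.54] | 7.91 | <0.001 |
| log(Volume) | 0.458 | 0.038 | [0.38, 0.53] | 12.05 | <0.001 |
| Branch Width [wide] | 0.287 | 0.089 | [0.11, 0.46] | 3.22 | 0.001 |
| Random Effects | | | | | |
| Site (Intercept) SD | 0.484 | | | | |
| Site (Slope) SD | 0.110 | | | | |
| Residual SD | 0.876 | | | | |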

Interpretation for naive readers: The scaling exponent (β = 0.46) describes how fauna abundance changes with coral volume. A value of 1.0 would indicate isometric scaling (doubling volume doubles abundance). Our value of 0.46 means fauna density decreases in larger corals: a 10-fold volume increase yields only a 10^0.46 ≈ 2.9-fold abundance increase, consistent with limited propagule supply being spread across more habitat.

Reproducibility: Generated by scripts/05_coral_cafi_relationships.R (lines 80–150). Model object saved to output/objects/scaling_glmm.rds.

2.4 Table S4: Site-Specific Scaling Exponents

Supports: Main text Results → Result 2, Discussion → Propagule Dilution (pages 5, 7)

This table shows that sublinear scaling is consistent across all three reef sites, strengthening the generality of our findings.

Table S4. Site-specific scaling exponents. All three sites show consistent sublinear scaling (β < 1).

| Site | β (Exponent) | SE | 95% CI | p (β < 1) | R² |
|---|---|---|---|---|---|
| HAU (Fringing) | 0.52 | 0.08 | [0.36, 0.68] | <0.001 | 0.54 |
| MAT (Lagoon) | 0.44 | 0.07 | [0.30, 0.58] | <0.001 | 0.62 |
| MRB (Barrier) | 0.50 | 0.08 | [0.34, 0.66] | <0.001 | 0.51 |
| Overall | 0.46 | 0.04 | [0.38, 0.53] | <0.001 | 0.58 |
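
One way to test an exponent against isometry is a one-sided Wald test of β < 1, sketched below in Python using the HAU estimate and standard error from Table S4. This is an assumption for illustration: the source does not state which test produced the p (β < 1) column, and the function names are hypothetical. The same estimate and SE also reproduce the tabulated 95% CI under a normal approximation.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def wald_test_sublinear(beta, se, null_value=1.0):
    """One-sided Wald test of H1: beta < null_value. Returns (z, p)."""
    z = (beta - null_value) / se
    return z, normal_cdf(z)

def wald_ci(beta, se, z_crit=1.96):
    """Normal-approximation 95% confidence interval."""
    return beta - z_crit * se, beta + z_crit * se

z, p = wald_test_sublinear(0.52, 0.08)  # HAU row of Table S4
lower, upper = wald_ci(0.52, 0.08)
print(round(z, 2), p < 0.001)            # -6.0 True
print(round(lower, 2), round(upper, 2))  # 0.36 0.68
```

Note that the z statistics in Table S3 test β against 0, whereas sublinearity is a statement about β relative to 1, so the two tests answer different questions.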

Reproducibility: Site-specific models in scripts/05_coral_cafi_relationships.R (lines 160–200).

2.5 Table S5: Network Metrics

Supports: Main text Results → Result 3: Network Structure Reveals Non-Random Assembly (page 6)

This table compares observed network properties to null model expectations, demonstrating that cryptofauna co-occurrence networks are significantly more structured than expected by chance.

Table S5. Co-occurrence network metrics vs. null model. Null expectations from 1,000 degree-preserving randomizations. Elevated transitivity (z = 7.85) and modularity (z = 22.0) indicate non-random species associations.
| Metric | Observed | Null (mean ± SD) | z-score | p | Interpretation |
|---|---|---|---|---|---|
| Transitivity (clustering) | 0.28 | 0.07 ± 0.02 | 7.85 | <0.0001 | 3.8× higher clustering |
| Modularity (Q) | 0.52 | 0.08 ± 0.02 | 22.0 | <0.0001 | Non-random modules |
| Mean path length | 2.34 | 2.89 ± 0.12 | -4.58 | <0.001 | Shorter paths |
| Number of modules | 6 | | | | Distinct species groups |

Interpretation for naive readers: Transitivity measures how often species that share a common associate also co-occur with each other. Our networks show 3.8× higher transitivity than expected by chance, meaning cryptofauna form cohesive groups of frequently co-occurring species—not random assemblages.
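
The z-scores in Table S5 standardize each observed metric against its null distribution: z = (observed − null mean) / null SD. The Python sketch below shows the calculation, both from summary values and from a list of null-model draws; the function names are illustrative and the null draws themselves are not reproduced here.

```python
from statistics import mean, stdev

def null_z_score(observed, null_mean, null_sd):
    """Standardized effect size of an observed metric against a null model."""
    return (observed - null_mean) / null_sd

def z_from_null_draws(observed, null_values):
    """Same quantity, computed from the metric values of randomized networks."""
    return null_z_score(observed, mean(null_values), stdev(null_values))

# Summary values from Table S5
print(round(null_z_score(0.52, 0.08, 0.02), 1))  # modularity: 22.0
print(round(null_z_score(2.34, 2.89, 0.12), 2))  # mean path length: -4.58
```

Recomputing from the rounded table entries does not always reproduce the reported z-score (for transitivity the rounded values give 10.5 rather than 7.85), presumably because the reported values were calculated from the unrounded null mean and SD.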

Reproducibility: Network analysis in scripts/06_network_analysis.R. Metrics saved to output/tables/cafi_network_metrics.csv.

2.6 Table S6: Structurally Central Species

Supports: Main text Results → Result 4, Discussion → Network Modules (pages 6, 7-8)

Table S6. Structurally central species by network centrality. Degree = number of co-occurrence partners; Betweenness = importance for connecting network components. Alpheus diadema emerges as the most connected species, bridging multiple modules. Note: high centrality indicates structural importance but does not establish functional ‘keystoneness,’ which would require experimental manipulation.
| Rank | Species | Functional Group | Degree | Betweenness | Module | Role |
|---|---|---|---|---|---|---|
| 1 | Alpheus diadema | Snapping shrimp | 12 | 260 | 5 | Hub/Connector |
| 2 | Alpheus collumianus | Snapping shrimp | 8 | 171 | 5 | Hub |
| 3 | Caracanthus maculatus | Coral croucher fish | 7 | 232 | 3 | Connector |
| 4 | Trapezia serenei | Guardian crab | 6 | 45 | 1 | Hub |
| 5 | Alpheus lottini | Snapping shrimp | 6 | 89 | 1 | Hub |
| 6 | Macrophiothrix longipeda | Brittle star | 5 | 12 | 2 | Peripheral |

Interpretation for naive readers: Structurally central species are those with many connections in the co-occurrence network. Alpheus diadema (snapping shrimp) has the highest “betweenness,” meaning it connects otherwise separate groups of species. However, high network centrality indicates structural importance—not functional “keystoneness,” which would require experimental removal to demonstrate.

Reproducibility: Centrality analysis in scripts/06_network_analysis.R (lines 150–200). Full species table in output/tables/cafi_keystone_species.csv.

2.7 Table S7: Coral Condition Model

Supports: Main text Results → Result 2: No Evidence for Diversity-Condition Relationship (page 6)

Table S7. Coral condition (PC1) as a function of colony characteristics. Condition is independent of colony size and neighborhood, validating its use as an outcome variable in CAFI–condition analyses.
| Predictor | Estimate | SE | 95% CI | t | p |
|---|---|---|---|---|---|
| Colony volume (log) | −0.02 | 0.06 | [−0.14, 0.10] | −0.33 | 0.74 |
| Neighbor count | 0.04 | 0.04 | [−0.03, 0.11] | 1.10 | 0.27 |
| Site (HAU vs MAT) | −0.11 | 0.27 | [−0.64, 0.42] | −0.41 | 0.68 |
| Site (MRB vs MAT) | 0.19 | 0.28 | [−0.36, 0.74] | 0.68 | 0.50 |

Reproducibility: Condition models in scripts/05a_coral_characteristics.R and scripts/18_cafi_predicts_condition.R.

2.8 Table S8: Richness–Condition Diagnostic Analysis

Supports: Main text Results → Result 2: No Evidence for Diversity-Condition Relationship (page 6)

This section provides the complete diagnostic analysis demonstrating that the apparent richness-condition relationship is a sampling artifact. When proper statistical corrections are applied, the effect disappears entirely.

2.8.1 S8.1 The Sampling Problem

Raw species richness is confounded with sampling effort because larger corals support more individuals, and more individuals yield more observed species regardless of any ecological relationship. Our diagnostic analysis quantifies this confound:

Table S8a. Sampling artifact diagnostics. The strong abundance-richness correlation (r = 0.81) indicates that most variation in observed richness reflects sampling effort (more individuals → more species detected), not true differences in community diversity.
| Metric | Value | p-value | Interpretation |
|---|---|---|---|
| Abundance–Richness correlation | r = 0.813 | < 0.001 | Very strong positive; classic species-area effect |
| Richness variance explained by abundance | R² = 72.3% | < 0.001 | Most richness variation is sampling effort, not true diversity |
| Richness–Volume correlation | r = 0.668 | < 0.001 | Larger corals have higher richness (sampling artifact) |
| Condition–Volume correlation | r = −0.042 | 0.70 | Condition independent of volume (good) |

2.8.2 S8.2 Corrected Diversity Metrics

We applied four complementary approaches to control for sampling artifacts, all yielding null results:

Table S8b. Diversity-condition relationship under different correction methods. All properly-corrected diversity metrics (bold) show no significant relationship with coral condition. The apparent raw richness effect (p = 0.041) is a methodological artifact.
| Diversity Metric | Definition | β (effect) | SE | p-value | Conclusion |
|---|---|---|---|---|---|
| Raw richness (naive) | Simple species count | +0.058 | 0.028 | 0.041 | Appears significant (ARTIFACT) |
| Raw richness + abundance | Richness controlling for # individuals | +0.069 | 0.042 | 0.104 | Effect disappears with abundance control |
| Rarefied richness (n=10) | Expected richness at standard sample size | −0.011 | 0.164 | 0.93 | NO EFFECT: sampling artifact removed |
| Residualized richness | Richness independent of abundance (regression residuals) | +0.055 | 0.040 | 0.17 | No pure diversity effect |
| Evenness (Pielou’s J) | Relative abundance distribution (abundance-independent) | −2.94 | 1.80 | 0.11 | No effect independent of richness |
| Shannon H’ | Information-theoretic diversity | +0.087 | 0.364 | 0.81 | No effect |

Statistical Details: Rarefaction and Residualization

Rarefaction: We calculated expected richness at a standardized sample size of 10 individuals using vegan::rarefy(). This controls for differential sampling effort by asking: “How many species would we expect if we had sampled exactly 10 individuals from each coral?” Only corals with ≥10 individuals (n = 68, 81% of sample) were included.
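
For readers unfamiliar with rarefaction, the expected richness in a random subsample of n individuals can be computed with the classical hypergeometric formula, E[S_n] = Σ_i [1 − C(N − N_i, n) / C(N, n)], where N_i is the abundance of species i and N the total. The Python sketch below implements that formula as an independent illustration; it is not the authors' code, and the function name and toy counts are hypothetical.

```python
from math import comb

def rarefied_richness(counts, n):
    """Expected number of species in a random subsample of n individuals
    drawn without replacement from a community with the given counts."""
    counts = [c for c in counts if c > 0]
    total = sum(counts)
    if n > total:
        raise ValueError("sample size n exceeds the number of individuals")
    denominator = comb(total, n)
    # comb(total - c, n) is 0 when fewer than n individuals belong to other
    # species, i.e. species i is certain to appear in the subsample.
    return sum(1.0 - comb(total - c, n) / denominator for c in counts)

# Toy coral with 20 individuals across 4 species, rarefied to n = 10
print(round(rarefied_richness([12, 5, 2, 1], 10), 2))  # 3.25
```

Corals with fewer than n individuals cannot be rarefied to n, which is why the function raises an error in that case and why such corals must be excluded from the analysis.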

Residualization: We regressed raw richness on log(abundance) (R² = 0.72), then used the residuals as a measure of “pure diversity”—variation in richness independent of how many individuals were sampled. Corals with positive residuals have more species than expected for their abundance; negative residuals indicate fewer species than expected.

Evenness: Pielou’s J = H’/log(S), where H’ is Shannon diversity and S is richness. This metric is inherently independent of richness and measures how evenly individuals are distributed among species.
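
The evenness formula can be written out in a few lines of Python. This sketch uses natural logarithms for both H’ and log(S) (any base works as long as it is the same in numerator and denominator); the function names and toy counts are illustrative only.

```python
import math

def shannon(counts):
    """Shannon diversity H' = -sum(p_i * ln p_i) over species present."""
    total = sum(counts)
    proportions = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p) for p in proportions)

def pielou_j(counts):
    """Pielou's evenness J = H' / ln(S); undefined when fewer than 2 species."""
    richness = sum(1 for c in counts if c > 0)
    if richness < 2:
        return float("nan")  # ln(1) = 0, so J cannot be computed
    return shannon(counts) / math.log(richness)

print(round(pielou_j([5, 5, 5, 5]), 2))  # 1.0 (perfectly even)
print(round(pielou_j([17, 1, 1, 1]), 2))  # well below 1 (one dominant species)
```

Corals hosting a single species have no defined evenness, so they drop out of any analysis based on J.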

2.8.3 S8.3 Key Species Analysis

To test whether specific taxa drive any relationship, we examined individual species correlations with condition:

Table S8c. Individual species-condition correlations (top 6 by raw p-value). Of 28 species present in ≥10 corals, only 2 showed raw p < 0.05, and neither survives FDR correction.

| Species | Prevalence | Correlation | p (raw) | p (FDR) | Interpretation |
|---|---|---|---|---|---|
| Hapalocarcinus (gall crab) | 6 corals | r = +0.27 | 0.013 | 0.36 | Only 2 species p < 0.05 raw |
| Periclimenes (cleaner shrimp) | 16 corals | r = +0.23 | 0.038 | 0.49 | Neither survives FDR correction |
| Fennera (pistol shrimp) | 26 corals | r = +0.20 | 0.062 | 0.49 | Marginal, not significant |
| Breviturma pica (brittle star) | 20 corals | r = +0.20 | 0.070 | 0.49 | Marginal, not significant |
| Trapezia tigrina (guard crab) | 11 corals | r = +0.15 | 0.160 | 0.67 | Not significant |
| Calcinus latens (hermit) | 25 corals | r = −0.15 | 0.167 | 0.67 | Not significant (negative) |

2.8.4 S8.4 Leave-One-Species-Out Analysis

We tested whether removing any single species substantially changed the richness-condition correlation:

  • Baseline correlation: r = 0.139 (raw richness vs. condition)
  • Maximum change from removing any species: Δr = 0.016
  • Conclusion: No single species is critical for the apparent relationship

This confirms that the (non-significant) correlation is distributed across many species rather than driven by any particular taxon.
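
The leave-one-species-out procedure can be sketched as follows in Python: recompute richness with each species removed in turn, correlate it with condition, and record the change from the baseline correlation. The abundance matrix and condition scores below are made-up toy values, the function names are hypothetical, and Pearson correlation is an assumption (the source does not name the coefficient).

```python
import math
from statistics import mean

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def richness(matrix, skip=None):
    """Species count per coral, optionally ignoring one species column."""
    return [sum(1 for j, c in enumerate(row) if c > 0 and j != skip)
            for row in matrix]

def leave_one_species_out(matrix, condition):
    """Return the baseline r and the change in r after dropping each species."""
    baseline = pearson_r(richness(matrix), condition)
    deltas = [pearson_r(richness(matrix, skip=j), condition) - baseline
              for j in range(len(matrix[0]))]
    return baseline, deltas

# Toy data: 5 corals (rows) x 4 species (columns), plus condition scores
abundance = [[3, 0, 1, 0], [1, 2, 0, 0], [0, 1, 1, 4], [2, 2, 2, 1], [5, 0, 0, 0]]
condition = [0.1, -0.3, 0.4, 0.9, -1.1]
baseline, deltas = leave_one_species_out(abundance, condition)
print(round(baseline, 3), round(max(abs(d) for d in deltas), 3))
```

The largest absolute entry of deltas plays the role of the maximum Δr reported above.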

2.8.5 S8.5 Taxonomic Group Analysis

We tested whether any functional group showed elevated condition in species-rich assemblages:

Table S8d. Group-specific richness-condition correlations. Crab richness shows the only signal (p = 0.014), but this does not survive multiple testing correction. The marginal crab effect may reflect the well-documented benefits of guardian crabs (Trapezia) rather than crab diversity per se.
| Functional Group | Richness–Condition r | p-value | Survives FDR? | Interpretation |
|---|---|---|---|---|
| Crabs (Brachyura) | +0.27 | 0.014 | No (p = 0.056) | Crab richness marginally associated; may reflect guardian crab presence |
| Shrimp (Caridea) | +0.04 | 0.73 | No | No effect |
| Fish | +0.07 | 0.52 | No | No effect |
| Snails (Gastropoda) | +0.09 | 0.44 | No | No effect |
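
The multiple-testing adjustment can be illustrated with the Benjamini-Hochberg procedure, applied here to the four raw p-values in Table S8d. This is a sketch under an assumption: the source says only "FDR" and does not name the procedure, so Benjamini-Hochberg is a plausible choice rather than a confirmed one. The function name is hypothetical.

```python
def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone adjusted values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Raw p-values from Table S8d: crabs, shrimp, fish, snails
raw = [0.014, 0.73, 0.52, 0.44]
adjusted = benjamini_hochberg(raw)
print([round(p, 3) for p in adjusted])  # [0.056, 0.73, 0.693, 0.693]
```

With four tests the smallest adjusted value is 0.014 × 4 = 0.056, which matches the crab entry in the table and sits just above the 0.05 threshold.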

2.8.6 Summary: No Robust CAFI-Condition Relationship

The comprehensive diagnostic analysis demonstrates:

  1. Strong sampling confound: 72% of richness variance reflects abundance (sampling effort), not true diversity
  2. Rarefied richness null: When standardized to equal sampling (n=10), richness shows no relationship with condition (β = −0.011, p = 0.93)
  3. No key species: No individual species drives the relationship; no species survives FDR correction
  4. Composition null: Apparent composition-condition correlation (p = 0.007) was driven by 3 extreme points; robust methods (Spearman, Kendall) showed no relationship

Biological interpretation: These results do not mean cryptofauna provide no benefits to corals—guardian crabs demonstrably protect their hosts in experimental studies. However, our observational data show no detectable relationship between community attributes (diversity or composition) and coral physiological condition.

Reproducibility: Full diagnostic analysis in scripts/richness_condition_diagnostic.R. Output tables: output/tables/richness_condition_diagnostic.csv, output/tables/species_condition_correlations.csv, output/tables/leave_one_species_out.csv.

2.9 Table S9: Alpha Diversity by Site

Supports: Main text Results → Result 1 (page 5); provides diversity context

Table S9. Alpha diversity metrics by site (mean ± SD). MAT (lagoon) shows highest diversity across all metrics. Kruskal-Wallis test confirms significant site differences (p = 0.003).
| Site | Richness (S) | Shannon (H’) | Simpson (1-D) | Evenness (J’) | Test |
|---|---|---|---|---|---|
| HAU | 14.2 ± 5.1 | 2.12 ± 0.45 | 0.81 ± 0.08 | 0.78 ± 0.12 | |
| MAT | 16.8 ± 6.2 | 2.34 ± 0.52 | 0.84 ± 0.07 | 0.81 ± 0.10 | |
| MRB | 12.9 ± 4.8 | 1.98 ± 0.41 | 0.79 ± 0.09 | 0.76 ± 0.13 | |
| Overall | 14.7 ± 5.5 | 2.15 ± 0.48 | 0.81 ± 0.08 | 0.78 ± 0.12 | KW p = 0.003 |

Reproducibility: Diversity calculations in scripts/04_diversity_analysis.R. Output: output/tables/alpha_diversity_metrics.csv.

2.10 Table S10: Model Comparison

Supports: Main text Methods → Statistical Analyses; validates model selection

Table S10. Model comparison for CAFI abundance (AIC selection). Bold indicates best-supported parsimonious model. R²m = marginal (fixed effects only); R²c = conditional (including random effects). Adding condition provides minimal improvement (ΔAIC = 1.5).
| Model | Fixed Effects | AIC | ΔAIC | R²m | R²c |
|---|---|---|---|---|---|
| Volume only | 1 | 876.2 | 56.4 | 0.42 | 0.58 |
| Volume + Site | 2 | 834.5 | 14.7 | 0.51 | 0.64 |
| Volume + Site + Branch Width | 3 | 821.3 | 1.5 | 0.55 | 0.67 |
| Full (+ Condition) | 4 | 819.8 | 0 | 0.56 | 0.68 |
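
The ΔAIC column is each model's AIC minus the lowest AIC in the set. The Python sketch below reproduces it from the AIC values in Table S10 and, as an optional extra that is not reported in the source, converts the differences into Akaike weights. The function and variable names are illustrative.

```python
import math

# AIC values from Table S10
aic = {
    "Volume only": 876.2,
    "Volume + Site": 834.5,
    "Volume + Site + Branch Width": 821.3,
    "Full (+ Condition)": 819.8,
}

def delta_aic(aic_by_model):
    """Difference between each model's AIC and the best (lowest) AIC."""
    best = min(aic_by_model.values())
    return {model: value - best for model, value in aic_by_model.items()}

def akaike_weights(aic_by_model):
    """Relative support for each model: exp(-delta/2), normalized to sum to 1."""
    relative = {m: math.exp(-0.5 * d) for m, d in delta_aic(aic_by_model).items()}
    total = sum(relative.values())
    return {m: r / total for m, r in relative.items()}

deltas = delta_aic(aic)
weights = akaike_weights(aic)
print({m: round(d, 1) for m, d in deltas.items()})
```

A difference of less than about 2 AIC units is conventionally read as roughly equivalent support, which is the basis for preferring the simpler model when adding a term changes AIC by only 1.5.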

Reproducibility: Model comparison in scripts/05_coral_cafi_relationships.R (lines 250–300).


3 Supplementary Figures

All figures provide visual support for main text findings. Each figure caption explains what it shows, why it matters, and where to find the corresponding main text discussion.

3.1 Figure S1: Study Site Map

Supports: Main text Methods → Study System (page 3); Figure 1A in main text

Figure S1. Study sites around Mo’orea, French Polynesia. Three sites span a gradient from sheltered lagoon (MAT, green) to exposed barrier reef (MRB, purple). HAU (fringing reef, blue) experiences intermediate wave exposure. Inset shows location in the South Pacific. Site selection ensures representation of major reef environments on Mo’orea. Why this matters: Environmental heterogeneity among sites tests whether cryptofauna assembly rules generalize or are context-dependent.

Reproducibility: scripts/03_spatial_patterns.R (lines 20–80).

3.2 Figure S2: Coral Size Distributions

Supports: Main text Methods → Coral and Fauna Sampling; validates sampling design

Figure S2. Coral colony volume distributions by site. Box plots show median, interquartile range, and outliers. All sites span similar volume ranges (approximately 100–10,000 cm³), ensuring that site effects on community composition are not confounded by systematic size differences. Why this matters: If sites differed systematically in coral size, apparent site effects could actually reflect size effects.

Reproducibility: scripts/05a_coral_characteristics.R (lines 50–80).

3.3 Figure S3: Position Correction Validation

Supports: Main text Results → Result 4 (page 6); Supplementary Methods S1.4

Figure S3. Validation of position correction for physiological measurements. (A) Stump length (sampling position) increases with colony volume (r = 0.565, p < 0.001), creating a confound. (B) Position-corrected traits show no significant correlation with volume (|r| < 0.10). Why this matters: Without this correction, any diversity–condition analysis could be confounded by both variables correlating with colony size.

Reproducibility: scripts/05a_coral_characteristics.R (lines 120–180).

3.4 Figure S4: Alpha Diversity by Site

Supports: Main text Results → Result 1 (page 5)

Figure S4. Alpha diversity metrics across sites. (A) Species richness per coral. (B) Shannon diversity (H’). (C) Simpson diversity (1-D). (D) Pielou’s evenness. MAT (lagoon) shows highest diversity across all metrics; MRB (barrier reef) shows lowest. Kruskal-Wallis tests indicate significant site differences for richness and Shannon (p < 0.01). Why this matters: Site differences in diversity support the PERMANOVA finding that reef environment structures cryptofauna communities.

Reproducibility: scripts/04_diversity_analysis.R (lines 100–150).

3.5 Figure S5: Rarefaction Curves

Supports: Main text Methods → validates sampling completeness

Figure S5. Species rarefaction curves by site. Expected richness as a function of sampling effort (number of individuals). Shaded regions indicate 95% confidence intervals. Curves approach but do not fully reach asymptotes, suggesting some rare species remain unsampled. However, the curves are sufficiently similar to justify between-site comparisons. Why this matters: Rarefaction confirms that site differences in diversity are not artifacts of unequal sampling effort.

Reproducibility: scripts/04_diversity_analysis.R (lines 160–200).

3.6 Figure S6: NMDS Ordination

Supports: Main text Figure 2A; Results → Result 1 (page 5)

Figure S6. NMDS ordination of CAFI communities. Points represent individual corals, colored by site with 95% confidence ellipses. Environmental vectors show significant correlations with ordination axes (p < 0.05). Site separation confirms PERMANOVA finding that reef environment structures community composition (Table S2). Stress = 0.18 indicates adequate ordination quality. Why this matters: Visual confirmation that sites support compositionally distinct cryptofauna assemblages.

Reproducibility: scripts/04_diversity_analysis.R (lines 220–280).

3.7 Figure S7: Power-Law Scaling

Supports: Main text Figure 2A; Results → Result 2 (page 5)

Figure S7. Fauna abundance scales sublinearly with coral volume. Log-log plot showing CAFI abundance vs. coral volume. Points colored by site; solid line shows GLMM prediction with 95% CI. Dashed line indicates isometric scaling (β = 1). The observed slope (β = 0.46) is significantly less than 1.0, indicating that larger corals harbor proportionally fewer fauna per unit volume, consistent with propagule dilution. Why this matters: This is the central result supporting the propagule dilution hypothesis.

Reproducibility: scripts/05_coral_cafi_relationships.R (lines 80–150).
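
The sublinearity test described in the caption can be illustrated with a minimal sketch. It uses simulated data and ordinary least squares on the log-log scale rather than the GLMM reported in the main text, so the variable names, the simulated volumes, and the noise term are all assumptions made for illustration only.

```r
# Simulated data (hypothetical): 40 corals spanning three orders of magnitude in volume.
n_corals <- 40
volume <- exp(seq(log(50), log(50000), length.out = n_corals))  # assumed units: cm^3

# Abundance generated with a true exponent of 0.46, plus a small deterministic
# wiggle (sin) standing in for sampling noise so the example is reproducible.
abundance <- exp(1 + 0.46 * log(volume) + 0.1 * sin(seq_len(n_corals)))

# Ordinary least squares on the log-log scale: the slope is the scaling exponent.
fit <- lm(log(abundance) ~ log(volume))
beta_hat <- unname(coef(fit)["log(volume)"])
beta_ci <- confint(fit, "log(volume)", level = 0.95)

# Sublinear scaling is supported when the whole interval lies below 1 (isometry).
is_sublinear <- beta_ci[1, 2] < 1
```

Comparing the upper confidence limit with 1 corresponds to comparing the fitted line with the dashed isometric line in the figure.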

3.8 Figure S8: Site-Specific Scaling

Supports: Main text Discussion → Propagule Dilution (page 7); Table S4

**Figure S8. Site-specific scaling relationships.** Separate log-log regressions for each site. All sites show consistent sublinear scaling (β < 1), and MAT (lagoon) shows the tightest relationship (R² = 0.62). Site-specific exponents range from 0.44 (MAT) to 0.52 (HAU), with overlapping confidence intervals indicating no significant heterogeneity. **Why this matters**: Confirms that propagule dilution operates consistently across reef environments.

Reproducibility: scripts/05_coral_cafi_relationships.R (lines 160–200).

3.9 Figure S9: Co-occurrence Network

Supports: Main text Figure 4; Results → Result 4 (page 6)

**Figure S9. Species co-occurrence network.** Nodes represent species (sized by abundance, colored by functional group). Edges connect species with significant positive associations (|ρ| > 0.3, p < 0.05). Network shows significant modularity (Q = 0.52 vs. null expectation 0.08, p < 0.0001). The snapping shrimp *Alpheus diadema* (highlighted) emerges as the most connected species. **Why this matters**: Demonstrates that cryptofauna form structured assemblages, not random collections of species.

Reproducibility: scripts/06_network_analysis.R (lines 50–120).
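
As a rough illustration of how the edges in such a network can be derived, the sketch below thresholds pairwise Spearman correlations. The abundance matrix is invented, the species names are placeholders, and reading "positive associations" as ρ > 0.3 is an assumption; the authors' script may differ in all of these respects.

```r
# Hypothetical abundance matrix: rows are corals, columns are species.
abund <- cbind(
  sp_a = 1:12,
  sp_b = c(2, 1, 4, 3, 6, 5, 8, 7, 10, 9, 12, 11),  # tracks sp_a closely
  sp_c = 12:1,                                       # mirror image of sp_a
  sp_d = (7 * (1:12)) %% 13                          # scrambled ordering
)

species <- colnames(abund)
n_sp <- length(species)
rho <- matrix(0, n_sp, n_sp, dimnames = list(species, species))
pval <- matrix(1, n_sp, n_sp, dimnames = list(species, species))

# Pairwise Spearman correlations with asymptotic p-values.
for (i in seq_len(n_sp - 1)) {
  for (j in (i + 1):n_sp) {
    test <- cor.test(abund[, i], abund[, j], method = "spearman", exact = FALSE)
    rho[i, j] <- rho[j, i] <- unname(test$estimate)
    pval[i, j] <- pval[j, i] <- test$p.value
  }
}

# Keep an edge only for positive, significant associations.
adjacency <- rho > 0.3 & pval < 0.05

# Node degree: number of partners per species.
degree <- rowSums(adjacency)
```

The resulting adjacency matrix could then be passed to a graph library for module detection; igraph, listed in the software requirements, provides `cluster_louvain()` and `modularity()` for that step.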

3.10 Figure S10: Network Modules

Supports: Main text Table 1; Results → Result 3 (page 6)

**Figure S10. Network module structure.** (A) Network colored by module membership (Louvain algorithm). (B) Module composition by taxonomic group. (C) Degree distribution showing approximate scale-free properties. Six modules correspond to functionally coherent species groups (e.g., Module 1 = guardian crabs and associated shrimp; Module 2 = brittle stars). **Why this matters**: Modular structure suggests ecological organization—species within modules may share habitat requirements or facilitate each other's presence.

Reproducibility: scripts/06_network_analysis.R (lines 130–180).

3.11 Figure S11: Richness–Condition Diagnostic Visualization

Supports: Main text Results → Result 2: No Evidence for Diversity-Condition Relationship (page 6); Table S8

**Figure S11. Raw vs. corrected diversity-condition relationships.** Total CAFI abundance vs. coral condition score. While raw richness shows an apparent positive relationship when controlling only for volume (β = 0.058, p = 0.041), this effect is a sampling artifact: larger corals yield more individuals, inflating observed richness. When properly corrected (rarefaction, residualization, evenness), all diversity metrics show no significant relationship with condition (see Table S8b). Points colored by site. **Why this matters**: Demonstrates how species-area sampling artifacts can produce spurious diversity-function relationships.

Reproducibility: scripts/18_cafi_predicts_condition.R and scripts/richness_condition_diagnostic.R. See Table S8 for complete diagnostic analysis.
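
To make the sampling artifact concrete, the sketch below implements the standard individual-based rarefaction formula in base R and applies it to two invented communities. The counts and the function name `rarefy_richness` are hypothetical; the vegan package provides `rarefy()` for the same calculation.

```r
# Expected number of species in a random subsample of n individuals,
# drawn without replacement from a sample with the given species counts.
rarefy_richness <- function(counts, n) {
  counts <- counts[counts > 0]
  total <- sum(counts)
  stopifnot(n >= 1, n <= total)
  # Probability that species i is absent from the subsample is
  # choose(total - count_i, n) / choose(total, n).
  sum(1 - choose(total - counts, n) / choose(total, n))
}

large_coral <- c(40, 20, 10, 5, 3, 1, 1)  # 80 individuals, 7 species
small_coral <- c(8, 4, 2, 1)              # 15 individuals, 4 species

# Raw richness favors the coral that yielded more individuals.
raw_richness <- c(large = sum(large_coral > 0), small = sum(small_coral > 0))

# Rarefy both samples to the size of the smaller one before comparing.
depth <- min(sum(large_coral), sum(small_coral))
rarefied <- c(
  large = rarefy_richness(large_coral, depth),
  small = rarefy_richness(small_coral, depth)
)
```

With these toy counts, the gap in raw richness shrinks once both samples are compared at the same number of individuals. This is the kind of correction the diagnostic analysis applies before testing for a richness-condition relationship.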

3.12 Figure S12: Taxonomic Composition

Supports: Main text Methods → Fauna Sampling (page 3); provides community context

**Figure S12. Taxonomic composition of CAFI communities.** (A) Overall composition showing dominance of decapod crustaceans (crabs + shrimp = 78% of individuals). (B) Taxonomic composition varies by site: HAU has more diverse gastropod/polychaete fauna; MAT is dominated by trapezid crabs; MRB shows elevated shrimp diversity. **Why this matters**: Context for interpreting which functional groups drive community patterns.

Reproducibility: scripts/02_community_composition.R (lines 50–100).

3.13 Figure S13: Community Composition Changes with Coral Size

Supports: Main text Results → Result 2 (Sublinear Scaling); extends to compositional turnover

**Figure S13. Community composition shifts with coral size.** (A) Stacked bar chart showing proportional abundance of major taxonomic groups across coral volume bins. Crabs dominate small corals while fish and snails increase in larger corals. (B) Proportional abundance of key taxa vs. coral volume on a log scale. Crabs show strong negative correlation (r = -0.56), while fish (r = +0.32) and snails (r = +0.28) increase with size. **Why this matters**: Beyond abundance scaling (Result 2), composition also shifts—larger corals support different assemblages dominated by larger-bodied taxa (fish, snails) rather than small crabs.

Reproducibility: scripts/generate_composition_size_figure.R. Full 4-panel version: output/figures/composition/composition_by_size.png.

3.14 Figure S14: Community Composition and Neighborhood Context

Supports: Main text Results → Result 4 (Coral Condition); examines neighborhood effects on composition

**Figure S14. Community composition varies with coral neighborhood.** (A) Proportional composition by number of neighboring corals (within 5m radius). Corals in denser neighborhoods show slightly higher shrimp proportions. (B) Proportional abundance vs. neighbor count for key taxa. Shrimp increase modestly with more neighbors (r = 0.33), while snails decrease (r = -0.22). However, these patterns are weaker than size effects (Figure S13). **Why this matters**: Neighborhood context has minimal effect on composition compared to coral size, consistent with the independence of condition from neighborhood (Result 4).

Reproducibility: scripts/generate_composition_neighborhood_figure.R. Full 4-panel version: output/figures/composition/composition_by_neighborhood.png.

3.15 Figure S15: Multivariate Composition Analysis (db-RDA)

Supports: Main text Results → integrates size and neighborhood effects on composition

**Figure S15. Distance-based Redundancy Analysis (db-RDA) of CAFI composition.** (A) RDA biplot showing coral communities constrained by size and neighborhood variables. Points colored by coral volume; red arrows show environmental vectors. The log(Volume) vector is longest, indicating coral size is the dominant predictor. (B) Marginal significance of each predictor (Type III tests). Coral size has the largest marginal effect, although it is not significant (F = 1.64, p = 0.15); neighborhood metrics (# neighbors, mean distance, neighbor volume) contribute essentially no unique explanatory power. (C) Variance partitioning: coral size explains 1.8% unique variance; neighborhood metrics explain 0% unique variance (the adjusted estimate is slightly negative). (D–F) Correlations of RDA axis 1 with each predictor confirm that the primary compositional gradient aligns with coral size (r = 0.50, p < 0.001). **Why this matters**: The multivariate analysis indicates that coral size is the only predictor with a detectable association with community composition; neighborhood context adds no explanatory power.

Statistical Details: db-RDA Analysis

  • Distance metric: Jaccard dissimilarity on presence/absence data (emphasizes compositional differences)
  • Method: Distance-based RDA via capscale() in vegan package
  • Predictors: log(coral volume), number of neighbors, mean neighbor distance, log(total neighbor volume)
  • Model selection: Forward selection retained only log(volume); backward elimination confirmed
  • Variance explained: Full model explains 11.1% (not significant at p = 0.245 with Jaccard; significant with Bray-Curtis on abundances)
  • Interpretation: The difference between Jaccard (presence/absence) and Bray-Curtis (abundance) results indicates that size effects manifest primarily through changes in relative abundances rather than complete species turnover

Reproducibility: scripts/generate_composition_rda_figure.R. Simplified manuscript version: output/figures/manuscript/Figure8_rda_composition.png.
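
The contrast between the two dissimilarity measures can be seen in a small worked example. The sketch below computes both by hand for two invented corals that share the same taxa in different proportions. The counts and function names are hypothetical; in vegan the equivalent calls would be `vegdist(x, method = "jaccard", binary = TRUE)` and `vegdist(x, method = "bray")`.

```r
# Jaccard dissimilarity on presence/absence: 1 - shared taxa / all taxa present.
jaccard_pa <- function(x, y) {
  present_x <- x > 0
  present_y <- y > 0
  1 - sum(present_x & present_y) / sum(present_x | present_y)
}

# Bray-Curtis dissimilarity on abundances.
bray_curtis <- function(x, y) sum(abs(x - y)) / sum(x + y)

# Hypothetical counts: same four taxa, different relative abundances.
coral_small <- c(crab = 12, shrimp = 3, snail = 1, fish = 1)
coral_large <- c(crab = 4,  shrimp = 3, snail = 6, fish = 4)

d_jaccard <- jaccard_pa(coral_small, coral_large)  # no turnover in taxa
d_bray <- bray_curtis(coral_small, coral_large)    # shift in abundances
```

Here the presence/absence measure sees two identical communities while the abundance-based measure does not. That is the situation described in the interpretation bullet above, where size effects act on relative abundances rather than on which species occur.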

3.16 Table S11: Predictor Significance for Community Composition

Supports: Figures S13–S15; quantifies relative importance of size vs. neighborhood

Table S11. Predictor significance for CAFI composition (db-RDA with Jaccard dissimilarity). Marginal (Type III) tests show unique contribution after controlling for other predictors. Only coral size shows substantial association with community composition. Asterisks indicate significance: * p

| Predictor | F-statistic | p-value | Unique Variance | RDA1 Correlation | Interpretation |
|---|---|---|---|---|---|
| Coral Size (log vol) | 1.64 | 0.15 | 1.8% | r = 0.50* | Primary driver |
| # Neighbors | 0.66 | 0.61 | ~0% | r = -0.31* | Weak (confounded w/ size) |
| Mean Distance | 1.08 | 0.37 | 0.1% | r = 0.09 | Negligible |
| Neighbor Volume (log) | 0.88 | 0.47 | ~0% | r = -0.16 | Negligible |

Reproducibility: Results from scripts/generate_composition_rda_figure.R.

3.17 Table S12: Species Associations with Coral Size Gradient

Supports: Figures S13 and S15; identifies which taxa drive size-composition relationship

Table S12. Taxon associations with coral size. RDA1 scores from db-RDA analysis; positive = associated with larger corals. Univariate correlations (r) between proportional abundance and log(volume). Crabs show strongest decline in larger corals (r = -0.56), while fish and snails increase. Ecological interpretation: Larger corals support larger-bodied taxa (fish, gastropods) while small, fast-colonizing crabs dominate smaller corals.

| Taxon | RDA1 Score | Univariate r | Size Association | Biological Interpretation |
|---|---|---|---|---|
| Snail | +0.71 | +0.28** | Large corals | Larger body size, space needs |
| Fish | +0.43 | +0.32** | Large corals | Territorial behavior |
| Hermit | +0.19 | +0.15 | Large corals | Shell availability in large corals |
| Crab | 0.00 | -0.56* | Small corals | Outcompeted by fish in large corals? |
| Shrimp | 0.00 | -0.12 | Neutral | Ubiquitous |
| Other | -0.01 | +0.08 | Neutral | Mixed taxa |
| Echinoderm | -0.27 | -0.18 | Small corals | Prefer cryptic small spaces? |

Reproducibility: Correlations from scripts/generate_composition_size_figure.R; RDA scores from scripts/generate_composition_rda_figure.R.

3.18 Table S13: Comprehensive Neighborhood Effects on CAFI Abundance

Supports: Main text Results → Result 2; Table 2 in main text; Figure 3B–D

This table provides the complete statistical output for all neighborhood metrics tested as predictors of CAFI abundance, controlling for coral volume. These results support the main text finding that coral size dominates and neighborhood context is negligible.

Table S13. Comprehensive neighborhood effects on CAFI abundance. All models control for log(coral volume). Bold p-values indicate statistical significance (p

| Metric | Definition | β (Effect) | SE | p-value | Unique R² | Direction | Interpretation |
|---|---|---|---|---|---|---|---|
| Neighbor count | # corals within 5m radius | +0.003 | 0.002 | 0.057 | <0.1% | Weak positive | Marginal spillover effect; biologically trivial |
| Neighbor volume (log) | Total cm³ of neighboring corals | −0.074 | 0.018 | **<0.001** | <0.5% | Negative | Confounded with coral size; large neighbors = large focal corals |
| Isolation index | Mean distance / coral size^(1/3) | +0.043 | 0.14 | 0.76 | ~0% | None | No propagule redirection at meter scales |
| Relative size (log) | Focal volume / mean neighbor volume | +0.020 | 0.005 | **<0.001** | <0.5% | Positive | Competitive asymmetry; ‘big fish in small pond’ |
| Spillover potential (log) | Neighbor volume / mean distance | −0.079 | 0.018 | **<0.001** | <0.5% | Negative | Opposite of facilitation prediction |

Statistical Details: Neighborhood Analysis

  • Survey protocols: Colonies were sampled using two complementary protocols: (1) “neighborhood surveys” (n = 63) where we recorded the number, distance, and volume of all Pocillopora neighbors within a 5-meter radius of the focal colony; and (2) “size surveys” (n = 51) focused on capturing the full range of coral sizes without neighborhood characterization. This dual-protocol design maximized both the size range for scaling analyses (all 114 colonies) and the sample size for neighborhood effects (63 colonies with complete spatial context).
  • Sample size for neighborhood analyses: n = 60–63 corals with complete neighborhood data (depending on metric completeness)
  • Neighborhood radius: 5 meters from focal coral center
  • Model structure: CAFI_abundance ~ neighborhood_metric + log(volume) (Poisson GLM)
  • Collinearity: Neighbor count, volume, and spillover potential are moderately correlated (r = 0.4–0.6); VIF < 3 in full models
  • Theoretical predictions:
    • Propagule redirection predicts positive isolation effect (isolated corals intercept more larvae/area)
    • Spillover/facilitation predicts positive neighbor density/volume effects
    • Competition predicts negative density effects
  • Observed pattern: Results do not clearly support any single mechanism; coral size dominates
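
The model structure listed above can be sketched as follows. The data frame, its column names, and the simulated values are hypothetical and chosen only so the example runs on its own. The counts are generated from coral volume alone, so the fitted neighborhood coefficient should sit near zero.

```r
# Hypothetical data: 30 corals with a volume gradient and a neighbor count
# that is unrelated to volume by construction.
n_corals <- 30
idx <- seq_len(n_corals)
corals <- data.frame(
  volume = exp(seq(log(100), log(20000), length.out = n_corals)),
  neighbor_count = (7 * idx) %% 11
)

# Counts depend on volume only (exponent 0.46); rounding yields integer counts.
corals$cafi_abundance <- round(exp(0.5 + 0.46 * log(corals$volume)))

# Poisson GLM: one neighborhood metric at a time, controlling for log(volume).
fit <- glm(cafi_abundance ~ neighbor_count + log(volume),
           family = poisson(link = "log"), data = corals)

# Coefficient table: estimate, standard error, z value, and p-value per term.
effects <- summary(fit)$coefficients
volume_exponent <- effects["log(volume)", "Estimate"]
neighbor_effect <- effects["neighbor_count", "Estimate"]
```

Swapping `neighbor_count` for another metric (isolation index, relative size, spillover potential) would give the corresponding row of a table like Table S13, assuming each metric is fitted in its own model as the model structure bullet suggests.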

Reproducibility: Analysis in scripts/14_local_neighborhood_effects.R and scripts/Fig6_comprehensive_neighborhood_effects.R. Results saved to output/objects/H3_neighborhood_results.rds.

3.19 Figure S16: Comprehensive Neighborhood Effects Panel

Supports: Main text Figure 2B–D; Table S13

**Figure S16. Comprehensive local neighborhood effects on CAFI communities.** Six-panel figure showing meter-scale spatial effects. (A) CAFI abundance vs. number of neighboring corals: weak positive trend. (B) CAFI abundance vs. total neighbor volume: negative relationship (confounded with coral size). (C) CAFI abundance vs. isolation index: no significant effect. (D) CAFI abundance vs. relative size: modest positive effect for corals larger than their neighbors. (E) CAFI abundance vs. spillover potential: negative relationship, opposite to the facilitation prediction. (F) Summary by neighbor density category. **Key finding**: After controlling for coral size, neighborhood metrics collectively explain <1% of variance in CAFI abundance. Coral size is the dominant predictor.

Reproducibility: scripts/Fig6_comprehensive_neighborhood_effects.R.


4 Supplementary Results: No CAFI-Condition Relationship

This section presents the detailed analysis demonstrating that neither diversity nor composition robustly predicts coral physiological condition, supporting Key Finding 2 in the main text.

4.1 Figure S17: Community Composition Does Not Robustly Predict Coral Condition

Supports: Main text Results → Result 2: No Evidence for Diversity-Condition Relationship

**Figure S17. Community composition does not robustly predict coral condition.** CAFI Community PC1 (x-axis) vs. Coral Condition Score PC1 (y-axis). Points colored by site. While the Pearson correlation appears significant (r = −0.29, p = 0.007), this result is driven entirely by 3 extreme observations. Robust correlation methods show no relationship: Spearman ρ = −0.06 (p = 0.61), Kendall τ = −0.03 (p = 0.67).

We tested whether community composition—independent of diversity—predicted coral condition. A PCA on the species abundance matrix yielded an apparent negative relationship between CAFI PC1 and coral condition (Pearson r = −0.29, p = 0.007). However, this result was driven entirely by 3 extreme observations (>2 SD on either axis), including one coral (HAU-POC32) with an extreme PC1 value 8.5 standard deviations below the mean.

4.1.1 Robust Correlation Methods Show No Relationship

When we applied correlation methods that are insensitive to outliers, or excluded the extreme observations, the relationship disappeared:

  • Spearman’s ρ = −0.06 (p = 0.61)
  • Kendall’s τ = −0.03 (p = 0.67)
  • Excluding the 3 extreme points: R² = 0.004 (p = 0.57)

No individual species survived FDR correction for multiple testing (all q > 0.35). We therefore conclude that community composition, like diversity, shows no robust relationship with coral condition in this dataset.
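
The sensitivity of Pearson's r to a single extreme point, and the Benjamini-Hochberg adjustment behind the q-values, can be sketched with invented numbers. The data and p-values below are hypothetical and are not taken from the study.

```r
# Thirty hypothetical observations with essentially no trend.
x <- 1:30
y <- (7 * x) %% 31

# Add one extreme observation far from the bulk of the data.
x_out <- c(x, -150)
y_out <- c(y, 150)

methods <- c("pearson", "spearman", "kendall")
without_outlier <- sapply(methods, function(m) cor(x, y, method = m))
with_outlier <- sapply(methods, function(m) cor(x_out, y_out, method = m))

# Benjamini-Hochberg FDR adjustment for a set of per-species tests.
species_p <- c(0.012, 0.04, 0.2, 0.5, 0.8)  # hypothetical raw p-values
species_q <- p.adjust(species_p, method = "BH")
```

In this toy case the single added point turns a near-zero Pearson correlation strongly negative, while the rank-based coefficients barely move, and none of the adjusted q-values falls below 0.05. This mirrors the reasoning applied to the three extreme observations above.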

Reproducibility: Analysis in scripts/18_cafi_predicts_condition.R. Figure: output/figures/cafi_predicts_condition/community_pc1_vs_condition_pc1.png.


5 Reproducibility Guide

This section provides a complete guide to reproducing all analyses and figures in the manuscript.

5.1 Data Files

All data files are in the data/ directory:

| File | Description | Rows | Main Variables |
|---|---|---|---|
| survey_cafi_data_w_taxonomy_summer2019_v5.csv | Individual fauna records | 3,989 | coral_id, species, count |
| survey_coral_characteristics_merged_v2.csv | Coral colony data | 114 | coral_id, site, volume, GPS |
| survey_master_phys_data_v3.csv | Coral physiology | 108 | coral_id, protein, zoox_density |

5.2 Analysis Scripts

Scripts in scripts/ are numbered for sequential execution:

| Script | Purpose | Key Outputs | Main Text Section |
|---|---|---|---|
| 01_load_clean_data.R | Data import and cleaning | Master dataset | Methods |
| 02_community_composition.R | Taxonomic summaries | Species tables | Methods |
| 03_spatial_patterns.R | Site maps | Figure S1 | Methods |
| 04_diversity_analysis.R | PERMANOVA, NMDS, diversity | Tables S2, S9; Figures S4–S6 | Result 1 |
| 05_coral_cafi_relationships.R | Scaling analyses | Tables S3–S4; Figures S7–S8 | Result 2 |
| 05a_coral_characteristics.R | Position correction, condition | Table S7; Figure S3 | Result 1 |
| 06_network_analysis.R | Co-occurrence networks | Tables S5–S6; Figures S9–S10 | Result 4 |
| 18_cafi_predicts_condition.R | Richness–condition diagnostic | Table S8; Figure S11 | Result 2 |

5.3 Reproducing the Analysis

```r
# 1. Set working directory to project root
setwd("/path/to/CAFI-Survey-2026")

# 2. Run master script (executes all analyses)
source("scripts/run_all_survey_analyses.R")

# 3. Render documents
rmarkdown::render("output/manuscript/MANUSCRIPT.Rmd")
rmarkdown::render("output/manuscript/SUPPLEMENTARY_MATERIALS.Rmd")
```

5.4 Software Requirements

  • R version: 4.3.0 or later
  • Key packages: tidyverse, vegan, lme4, lmerTest, igraph, sf, patchwork, kableExtra
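
Before running the pipeline, it may help to confirm that the listed requirements are met. The following is a minimal sketch; the helper name `missing_packages` is hypothetical and not part of the project scripts.

```r
required_pkgs <- c("tidyverse", "vegan", "lme4", "lmerTest",
                   "igraph", "sf", "patchwork", "kableExtra")

# Return the packages from `pkgs` that cannot be loaded in this R installation.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Check the R version against the stated minimum.
r_version_ok <- getRversion() >= "4.3.0"

not_installed <- missing_packages(required_pkgs)
if (length(not_installed) > 0) {
  message("Install before running the pipeline: ",
          paste(not_installed, collapse = ", "))
}
```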

5.5 Output Files

All outputs are in output/:

  • tables/ — CSV files of statistical results
  • figures/ — PNG files at 300 DPI
  • objects/ — RDS files of fitted models
  • manuscript/ — Rendered HTML documents

6 Data Availability

All data and code are publicly available:

For questions about data or analysis, contact Adrian Stier.


Supplementary Materials for: Stier et al. “Sublinear scaling and modular network structure reveal assembly rules for coral-associated cryptofauna”